Abstract

We establish a lower bound of $\Omega(\sqrt{n})$ on the bounded-error quantum query complexity of
read-once Boolean functions, providing evidence for the conjecture that $\Omega(\sqrt{D(f)})$ is a lower
bound for all Boolean functions. Our technique extends a result of Ambainis, based on the
idea that successful computation of a function requires "decoherence" of initially coherently
superposed inputs in the query register, having different values of the function. The number
of queries is bounded by comparing the required total amount of decoherence of a judiciously
selected set of input-output pairs to an upper bound on the amount achievable in a single
query step. We use an extension of this result to general weights on input pairs, and general
superpositions of inputs.

1 Introduction and summary of results



In the quantum query model of computation, a query register containing a string $x$ of $n$ bits is
accessed by a quantum computer via queries. In each query, the computer may ask for a single
bit $i$ of the query register, and the value $x_i$ of that bit is returned; queries are quantum coherent,
which means that a computer may superpose different query requests $i$ with complex amplitudes
$\alpha_i$, and is returned a superposition of the corresponding bit values $x_i$.

The quantum query model is the quantum analog to the classical boolean decision tree model, 
and is at least as powerful as the classical model. It is of great interest to compare computation in 
these two models, and to see the extent to which quantum computation gives an advantage over 
classical deterministic and randomized computation in this setting. One of the major algorithmic 
results in quantum computation is Grover's search algorithm [?], which can be viewed as a quantum
algorithm for computing the $n$-bit OR function with $O(\sqrt{n})$ queries. This compares with the $n$
queries required for deterministic decision trees and the $\Omega(n)$ queries required by classical randomized
trees. This can be used to speed up brute-force search for solutions to problems (e.g. in NP)
with polynomially-checkable solutions. $\Omega(\sqrt{n})$ is known to be a lower bound for OR [?], perhaps
our best piece of evidence that $\mathrm{BQP} \not\supseteq \mathrm{NP}$.

*This work was done in part while the author was visiting DIMACS and was supported in part by NSF under
grants EIA 00-80234 and 99-06105.

†Research supported by NSF grants CCR9988526, EIA 00-80234 and 99-06105.

There are two major variants of the quantum query model: the exact model and the bounded 
error model. In the exact model, we require that the quantum computation always output the 
correct answer, and in the bounded error model we allow that on any input, the computation may 
have a small probability $\epsilon$ of being incorrect. We write $Q_E(f)$ for the quantum complexity of $f$ in
the exact model, and $Q_\epsilon(f)$ for the quantum complexity of $f$ in the bounded error model, where
$\epsilon$ is the permissible error. (It is well known that for $\epsilon \in (0, 1/2)$ the value of $\epsilon$ only affects $Q_\epsilon(f)$
within a constant factor.) We also write $D(f)$ for the deterministic decision tree complexity of $f$.

In the exact model, there are examples of surprising speedups, for example the 2 bit XOR can 
be done exactly with one quantum query, but there are no known examples where exact quantum 
computation provides more than a constant factor speedup over deterministic decision trees. 

In the bounded error model, the OR function provides an example where quantum computation 
gives a significant speedup over deterministic (and randomized) decision trees. In fact, the quadratic 
speedup for OR is the best speedup result known for any boolean function. Perhaps the most 
important problem in quantum query complexity is to resolve the following conjecture (which 
seems to have been suggested by several researchers): 

Conjecture 1 For any boolean function $f$ and $\epsilon \in (0, 1/2)$, $Q_\epsilon(f) = \Omega(D(f)^{1/2})$.

The best known result of this type says that for any $f$, $Q_\epsilon(f) = \Omega(D(f)^{1/6})$. (This result appears
in the survey article [?] and is an improvement on an earlier $\Omega(D(f)^{1/8})$ bound in [?], which is
obtained by combining the arguments of [?] and an improvement, due to Nisan and Smolensky, of
a result of Nisan and Szegedy [?].) It should be remarked that the conjecture is for functions whose
domain is all of $\{0, 1\}^n$; for functions whose domain is restricted (promise problems) there are much
better speedups known, see e.g. [?]. In fact, the main component of Shor's factoring algorithm
[?] is a query algorithm for the promise problem of finding the period of a function by querying
its table of values (cf. [10]).

The main result of this paper is to prove the conjecture for the class of read-once functions, 
those functions expressible by a boolean formula in which each variable appears at most once. 
Our results provide a quantum counterpart to the lower bounds on the randomized decision tree 
complexity of read-once functions given in [11] and [12].

In [13], Ambainis introduced a lower bound technique for the quantum query model. He applied
this technique to obtain an $\Omega(\sqrt{n})$ bound for a particular read-once function, the function which is
an OR of $\sqrt{n}$ disjoint ANDs of size $\sqrt{n}$.

Our method for obtaining the $\sqrt{n}$ result generalizes Ambainis' method; in Section 3 we give
a generalization of his technique. Ambainis' approach is based on a thought-experiment in which 
we imagine the computer to operate on a superposition of inputs in the query register. The idea 
is that successful computation of a function requires, in the thought-experiment, "decoherence" of 
initially coherently superposed inputs in the query register, having different values of the function. 
This is because successful computation must correlate input states having different values of the 
function with nearly orthogonal states in the part of the computer where the result is to be read. In 
Ambainis' main results, the inputs are, essentially, taken to be superposed with equal coefficients. 
The number of queries is bounded by comparing the required total amount of decoherence of a 
judiciously selected set of input-output pairs to an upper bound on the amount achievable in a 
single query step. Our result generalizes this technique to give a corresponding result using the 
weighted total decoherence of input pairs (rather than just including/excluding pairs via weights 
equal to zero or one), and general superpositions of inputs rather than uniform ones. We anticipate 
that this result and this approach will prove useful well beyond the context of read-once functions
to which we apply it in this paper. 

2 Quantum query complexity 

In any quantum computation model, we think of the memory of the machine as composed of 
registers, where each register has a set of allowed values. A memory configuration is an assignment 
of values to registers. 

Each register $R$ is associated to a complex vector space $H_R$ whose dimension is equal to the
number of allowed values of the register. $H_R$ has a distinguished orthogonal basis whose members
are in one-to-one correspondence with the possible values of the register. We use the Dirac or
"bra-ket" notation for complex vector spaces: elements of such a space are denoted by the notation
$|\phi\rangle$, and viewed as complex column vectors. For such a vector, $\langle\phi|$ denotes the dual row vector whose
coordinates are the complex conjugates of those of $|\phi\rangle$. The notation $\langle\phi|\psi\rangle$ denotes the (complex)
inner product of $|\phi\rangle$ and $|\psi\rangle$. The standard basis vectors of $H_R$ are denoted by $|v\rangle$ where $v$ is an
allowed value for $R$.

A group of registers can be viewed together as a single virtual register. If $R_1, \ldots, R_k$ are
registers and $R$ is the virtual register obtained by combining them, then the value set of $R$ is the
product of the value sets of the $R_i$ and the space $H_R$ is naturally isomorphic to the tensor product
$H_{R_1} \otimes \cdots \otimes H_{R_k}$. If $v_1, \ldots, v_k$ are possible values for $R_1, \ldots, R_k$, then $|v_1, \ldots, v_k\rangle$ is a standard
basis element of $H_R$ and is identified with $|v_1\rangle \otimes \cdots \otimes |v_k\rangle$, which is also written $|v_1\rangle|v_2\rangle \cdots |v_k\rangle$.

In particular, the entire memory can be considered as a virtual register in this way and is 
associated to a complex vector space H. The quantum states of this memory are unit vectors in H. 

Now let's consider the quantum query model. Here the memory is viewed as divided into three
registers: the input register, which holds an $(n+1)$-bit string $x_0, x_1, \ldots, x_n$ where $x_0$ is fixed to 0,
the query register, which holds an integer between 0 and $n$, and the auxiliary memory, which has
no restrictions on its value set. The query register and auxiliary memory together comprise the
working memory. The standard basis of the associated complex vector space $H$ consists of vectors
of the form $|x, i, z\rangle$ in which the input string is $x$, the query is $i$, and the auxiliary memory is set
to $z$. Thus, a state of the computer is represented as:

$$\sum_{x,i,z} \alpha_{x,i,z} |x, i, z\rangle,$$

where for each memory assignment $x, i, z$, $\alpha_{x,i,z}$ is a complex number, and $\sum_{x,i,z} |\alpha_{x,i,z}|^2 = 1$.

The space $H$ can be viewed as the tensor product of two spaces $H_{in} \otimes H_W$, where the input
space $H_{in}$ is spanned by the $2^n$ basis vectors $|x\rangle$ corresponding to inputs, and the work space $H_W$
is spanned by vectors $|q, z\rangle$ corresponding to possible contents of the working memory. The space
$H_W$ is further decomposed as $H_Q \otimes H_A$, where the query space $H_Q$ is spanned by the $n + 1$ query
values $|q\rangle$ and the auxiliary space $H_A$ is spanned by the assignments $|z\rangle$ to the auxiliary memory. Thus
$|x, i, z\rangle$ is identified with the tensor products $|x\rangle \otimes |i, z\rangle = |x\rangle|i, z\rangle$ and $|x\rangle \otimes |i\rangle \otimes |z\rangle = |x\rangle|i\rangle|z\rangle$.

Each computation step is a unitary operator on this vector space. In the query model, there
are two types of operators allowed. A work space transformation is one that operates only on the
work space, which means it is of the form $I_{in} \otimes A$ where $I_{in}$ is the identity operator on $H_{in}$ and $A$
acts arbitrarily on $H_W$. The unitary operator $O$, called the oracle, operates as follows:

$$O|x, i, z\rangle = (-1)^{x_i} |x, i, z\rangle \qquad (1)$$

An algorithm is specified by (1) an arbitrary sequence $U_1, \ldots, U_t$ of work space operators and
(2) a pair of orthogonal projectors $P_0, P_1$ on the space $H_W$, i.e., a pair of linear maps satisfying

$$P_0^2 = P_0, \quad P_1^2 = P_1 \quad \text{and} \quad P_0 + P_1 = I_W.$$

An algorithm is executed as follows. The memory is initialized in the basis state with the input
register set to the input $x$ and all other registers set to 0. Then the sequence $U_1, O, U_2, O, \ldots, U_t, O$
is applied to the computer. For $l \in \{1, \ldots, t\}$ the pair $U_l, O$ is called the $l$-th step of the computation.
Observe that the operations $U_l$ and $O$ leave the input register unchanged. Formally this means
that the state of the computer is always of the form $|x\rangle \otimes |\psi\rangle$ where $x$ is the input and $|\psi\rangle$ is a
vector of $H_W$ (generally not a standard basis state).

The output of the computation is either 0 or 1, determined according to the following probability
distribution. If the final state of the computation is $|\psi\rangle$ then the computation outputs $j$ with
probability equal to $\|P_j|\psi\rangle\|^2$. (Note that the definition of $P_0$ and $P_1$ guarantees that the vectors
$P_0|\psi\rangle$ and $P_1|\psi\rangle$ are orthogonal and sum to a unit vector, which implies that the two probabilities
sum to 1.) The process which generates this distribution is called a measurement.
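As an illustration of the model just described, the following sketch simulates the one-query exact algorithm for the 2-bit XOR mentioned in the introduction. It is not taken from the paper: the function name `xor_one_query`, the particular choice of $U_1$ (any unitary sending query value 0 to an equal superposition of query values 1 and 2) and the choice of projectors are assumptions made for the example, and the auxiliary memory is left out.

```python
import math

def xor_one_query(x1, x2):
    """Return (Pr[output 0], Pr[output 1]) after a single oracle call.

    Hypothetical illustration: n = 2, no auxiliary memory, so the work
    space is spanned by the query values 0, 1, 2.
    """
    x = (0, x1, x2)                      # x_0 is fixed to 0
    s = 1.0 / math.sqrt(2.0)
    # U_1 is assumed to map the initial query value 0 to (|1> + |2>)/sqrt(2);
    # amplitudes are listed by query value 0, 1, 2.
    psi = [0.0, s, s]
    # Oracle (1): multiply the amplitude of query value i by (-1)^{x_i}.
    psi = [(-1) ** x[i] * psi[i] for i in range(3)]
    # P_1 projects onto (|1> - |2>)/sqrt(2); P_0 = I - P_1.
    overlap = s * psi[1] - s * psi[2]
    p1 = overlap ** 2
    p0 = sum(a * a for a in psi) - p1
    return p0, p1

for bits in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(bits, xor_one_query(*bits))
```

With these choices the output equals $x_1 \oplus x_2$ with probability 1, since the oracle leaves the state proportional to $|1\rangle + |2\rangle$ when the bits agree and to $|1\rangle - |2\rangle$ when they differ.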

Variants of this model have been considered; in particular the oracle O can be replaced by a 
more general transformation that transforms the work space depending on the value of the input 
bit indexed by the query register. It is well known that this generality can not speed up the 
computation by more than a factor of 2. 

The complexity of the algorithm is measured by the number of calls $t$ to the oracle. In the
bounded-error quantum query model, we fix some $\epsilon < 1/2$ and a computation is considered to
successfully compute $f$ if it $\epsilon$-computes $f$, which means that for every input, the probability that
the algorithm gives the wrong answer for that input is no greater than $\epsilon$. The $\epsilon$-error quantum
query complexity of $f$, denoted $Q_\epsilon(f)$, is the minimum number of steps in an algorithm that
$\epsilon$-computes $f$. It is well known that the choice of $\epsilon \in (0, 1/2)$ only affects the complexity up to a
constant factor.

To avoid confusion, we note that a different way of phrasing quantum query complexity problems
is sometimes used: the oracle is said to be a "black box function," $g$. This $g$ is what we are calling
the "input" (and using the letters $x$ or $y$ for); it is not the function $f$ being computed. The black
box terminology sometimes calls our $f$ something like $P$, and $P$ is said to be a property of the black
box function $g$. This model does not represent the input $g$ as a state of a register in the computer.
Rather, the computer is our computer minus the input register, and a query step is an application
of the unitary $O_g$ to the computer state. Generally, $O_g$ is viewed as acting on two registers, an
"input" register which is homologous to what we have called the query register, and an output
register. Its action in the standard basis is to compute $g$ of the state in the input register, and
write it (in modular arithmetic to ensure unitarity) in the output register, while keeping the input
around. (An alternative phase version of the query unitary, similar to (1), is sometimes used in
this picture, too.) We mention this approach primarily to forestall any confusion that could arise
because the terms "function" and "input" may be used for different things on this approach than on
the one we have adopted.

3 A general lower bound on quantum query complexity of Boolean 
functions 

In this section, we present a general extension of Ambainis' lower bound approach. 

Let $f$ be an $n$-variate boolean function whose query complexity we want to lower bound. The
lower bound is expressed in terms of a complex vector $|\alpha\rangle$ of length $2^n$ indexed by inputs (so it is
a member of $H_{in}$) and a $2^n \times 2^n$ nonnegative real matrix $\Gamma$ indexed by pairs of inputs, satisfying
$\Gamma_{xy} = 0$ if $f(x) = f(y)$. For such a matrix $\Gamma$, for each $i \in \{1, \ldots, n\}$ we define for $x \in \{0, 1\}^n$:

$$\nu_{x,i} = \sum_{y : y_i \ne x_i} \Gamma_{xy}, \qquad (2)$$

the total weight of inputs differing from $x$ on variable $i$. Further, for $i \in \{1, \ldots, n\}$ and $b \in \{0, 1\}$
we define:

$$\nu_i^b = \max_{x : f(x) = b} \nu_{x,i}, \qquad \nu = \max_{i \in \{1, \ldots, n\}} \nu_i^0 \nu_i^1. \qquad (3)$$

The main result of this section is: 

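Theorem 1 Let $f$ be an $n$-variate boolean function. Let $|\alpha\rangle$ be a nonnegative real valued vector
indexed by $\{0, 1\}^n$ and $\Gamma$ be a nonnegative real matrix indexed by $\{0, 1\}^n \times \{0, 1\}^n$ satisfying $\Gamma_{x,y} = 0$
whenever $f(x) = 0$ or $f(y) = 1$. If there is a quantum algorithm that $\epsilon$-computes $f$ using $t$ queries,
then

$$t \ge \frac{\langle\alpha|\Gamma|\alpha\rangle \left(1 - 2\sqrt{\epsilon(1-\epsilon)}\right)}{2\sqrt{\nu}} = \Omega\left(\frac{\langle\alpha|\Gamma|\alpha\rangle}{\sqrt{\nu}}\right).$$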

Buhrman and Szegedy (personal communication) have independently obtained a similar result. 

This should be compared to Theorem 6 of [13]. In this theorem, Ambainis gives a lower bound
which can be obtained from the above theorem by letting $\Gamma$ be a 0-1 matrix and letting $|\alpha\rangle$ be a 0-1
vector. Thus $|\alpha\rangle$ is the characteristic function of some subset $Z$ of inputs and $\Gamma$ is the characteristic
function of some relation $R$ on $f^{-1}(1) \times f^{-1}(0)$. (Actually, if we define $X = Z \cap f^{-1}(1)$ and
$Y = Z \cap f^{-1}(0)$, Ambainis defines the vector $|\alpha\rangle$ to have $\alpha_z = 1/\sqrt{|X|}$ for $z \in X$ and $1/\sqrt{|Y|}$
for $z \in Y$, but this normalization does not affect the bounds.) This choice of $\Gamma$ and $\alpha$ leads to
simplifications of both the numerator and denominator.

When specialized as above, the denominator in our expression reduces to the denominator in
Ambainis' Theorem 6. Ambainis defines $l_{x,i}$ to be the number of $y \in Y$ such that $R(x, y)$ and $x_i \ne y_i$,
and $l_{y,i}$ similarly. Our parameter $\nu$ specializes to Ambainis' parameter $l_{max}$, which is defined as the
maximum of the product $l_{x,i} l_{y,i}$ over ones $x$, zeroes $y$, and query indices $i$.

Ambainis also defines $m$ ($m'$) as the minimum over $x \in X$ ($y \in Y$) of the number of $y' \in Y$
($x' \in X$) such that $R(x, y')$ ($R(y, x')$), which bounds the numerator from below. Then Ambainis'
Theorem 6 says that the complexity is $\Omega(\sqrt{mm'/l_{max}})$.

In the application to read-once functions in the next section, we will not need the full generality
of Theorem 1. In particular, as Ambainis does, we restrict $\Gamma$ to be the characteristic function of a
relation. On the other hand, the nonuniformity of the coefficients $\alpha$ will be crucial to our results.
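To make the quantities in Theorem 1 concrete, here is a small sketch that evaluates the bound for the $n$-bit OR function. It rests on several assumptions that are ours, not the paper's: $\Gamma$ is taken symmetric (as the appendix does), $|\alpha\rangle$ is a unit vector, and $\nu$ is taken to be the maximum over query indices $i$ of the product of the largest $\nu_{x,i}$ over ones and the largest over zeroes, in line with the comparison to $l_{max}$ above. The names `theorem1_bound` and `or_example` are hypothetical.

```python
import math
from collections import defaultdict

def theorem1_bound(n, f, gamma, alpha, eps):
    """Evaluate <alpha|Gamma|alpha> (1 - 2 sqrt(eps (1 - eps))) / (2 sqrt(nu)).

    gamma: dict (x, y) -> weight, assumed symmetric; alpha: dict x -> weight.
    Inputs x, y are tuples of n bits; f maps such a tuple to 0 or 1.
    """
    # nu_{x,i}: total weight of inputs y that differ from x on variable i.
    nu_xi = defaultdict(float)
    for (x, y), g in gamma.items():
        for i in range(n):
            if x[i] != y[i]:
                nu_xi[(x, i)] += g
    # best[(i, b)]: largest nu_{x,i} over inputs x with f(x) = b.
    best = defaultdict(float)
    for (x, i), v in nu_xi.items():
        key = (i, f(x))
        best[key] = max(best[key], v)
    nu = max(best[(i, 0)] * best[(i, 1)] for i in range(n))
    weight = sum(alpha.get(x, 0.0) * g * alpha.get(y, 0.0)
                 for (x, y), g in gamma.items())
    kappa = 2 * math.sqrt(eps * (1 - eps))
    return weight * (1 - kappa) / (2 * math.sqrt(nu))

def or_example(n, eps):
    """OR on n bits: relate the all-zero input to each input of weight one."""
    zero = (0,) * n
    ones = [tuple(int(j == i) for j in range(n)) for i in range(n)]
    gamma = {}
    for e in ones:
        gamma[(e, zero)] = gamma[(zero, e)] = 1.0
    alpha = {zero: 1 / math.sqrt(2)}
    alpha.update({e: 1 / math.sqrt(2 * n) for e in ones})
    return theorem1_bound(n, lambda x: int(any(x)), gamma, alpha, eps)

print(or_example(16, 0.0))
```

Under these assumptions $\nu = 1$ and $\langle\alpha|\Gamma|\alpha\rangle = \sqrt{n}$, so the value returned grows like $\sqrt{n}$, matching the known $\Omega(\sqrt{n})$ bound for OR up to the constant.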

The remainder of this section is devoted to a proof of Theorem 1. The proof is a more or less
straightforward generalization of Ambainis' bound.

Let $U_1, \ldots, U_t$ be a sequence of work space operators and $P_0, P_1$ be a pair of orthogonal projectors
on $H_W$ that specify an algorithm. Once we fix an algorithm, then on input $x$, the state of
the computation after $j$ steps is of the form $|x\rangle|\psi_x(j)\rangle$, where $|\psi_x(j)\rangle \in H_W$. (We will normally
suppress the index $j$.) Let us consider the set of vectors $\{|\psi_x\rangle = |\psi_x(t)\rangle : x \in \{0, 1\}^n\}$ after $t$
computational steps, but before the final measurement.



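Proposition 1 (Ambainis) For a computation to $\epsilon$-compute $f$, it is necessary that for any $x, y$
such that $f(x) = 1$, $f(y) = 0$,

$$|\langle\psi_x|\psi_y\rangle| \le 2\sqrt{\epsilon(1-\epsilon)}. \qquad (4)$$

Proof: If the computation $\epsilon$-computes $f$, then

$$\|P_0|\psi_x\rangle\|^2 =: \eta_x \le \epsilon, \qquad \|P_1|\psi_y\rangle\|^2 =: \eta_y \le \epsilon. \qquad (5)$$

Now,

$$|\langle\psi_x|\psi_y\rangle| \le \sqrt{\langle\psi_x|P_1|\psi_x\rangle \langle\psi_y|P_1|\psi_y\rangle} + \sqrt{\langle\psi_x|P_0|\psi_x\rangle \langle\psi_y|P_0|\psi_y\rangle}$$
$$= \sqrt{(1 - \eta_x)\eta_y} + \sqrt{\eta_x(1 - \eta_y)} \le 2\sqrt{\epsilon(1-\epsilon)}, \qquad (6)$$

by the Schwarz inequality and (5). ■
We remark that these necessary conditions are not sufficient [?].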
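The last inequality in the proof of Proposition 1 can be checked numerically. The sketch below is only a sanity check under our own assumptions (a finite grid of error probabilities $\eta_x, \eta_y \le \epsilon \le 1/2$; the names `overlap_bound` and `max_overlap` are hypothetical): it evaluates $\sqrt{(1-\eta_x)\eta_y} + \sqrt{\eta_x(1-\eta_y)}$ over the grid and returns the maximum, which should be attained at $\eta_x = \eta_y = \epsilon$.

```python
import math

def overlap_bound(eta_x, eta_y):
    """Upper bound on |<psi_x|psi_y>| given the two error probabilities."""
    return math.sqrt((1 - eta_x) * eta_y) + math.sqrt(eta_x * (1 - eta_y))

def max_overlap(eps, steps=50):
    """Largest value of overlap_bound on a grid of eta_x, eta_y in [0, eps]."""
    grid = [eps * k / steps for k in range(steps + 1)]
    return max(overlap_bound(a, b) for a in grid for b in grid)

eps = 0.1
print(max_overlap(eps), 2 * math.sqrt(eps * (1 - eps)))
```

Writing $\eta = \sin^2\theta$, the expression equals $\sin(\theta_x + \theta_y)$, which is increasing in both angles as long as each is at most $\pi/4$; this is why the maximum sits at the corner of the grid.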

Define $M$ to be the matrix with elements $M_{xy} = |\langle\psi_x|\psi_y\rangle|$. (When we want to explicitly
consider the situation at step $l$, we write $M(l)$ for the matrix with elements $|\langle\psi_x(l)|\psi_y(l)\rangle|$.) It is
useful to group the inputs according to whether $f(x) = 0$ or 1, and view the matrix as a two-by-two
matrix of block structure $M_{0,0}, M_{0,1}, M_{1,0}, M_{1,1}$ given by this grouping. Proposition 1 involves
only the off-diagonal block, say, $M_{1,0}$. The general approach of Ambainis involves looking at how
much a single query can decrease the matrix elements of this off-diagonal block. Of course, many
matrix elements must be considered at once, because any individual matrix element $M_{xy}$ can be
brought down to zero by a single query to any bit $i$ for which $x_i \ne y_i$. In fact, such a query will
reduce to zero all $M_{xy}$ such that $x_i \ne y_i$. However, such a query will fail to have any impact on
matrix elements for which $x_i = y_i$. There is thus a tradeoff between various sets of matrix elements.
A successful deterministic classical algorithm must cannily choose $i$'s, depending on the results of
previous queries, such that each query distinguishes many inputs that were not distinguished by
previous ones. For a probabilistic classical algorithm, at each query probability may be distributed
between the indices $i$; and in a quantum algorithm, complex amplitude rather than probability
is distributed over the query indices. But in each case there is a tradeoff: more probability, or
more amplitude, on a query that distinguishes one set of input pairs, can reduce the probability,
or amplitude, on queries distinguishing another set.
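The tradeoff described above can be seen in a small simulation. The following sketch is our own illustration, with no auxiliary memory and a hypothetical function name `gram_after_one_query`: it computes $M(1)$ after one oracle call when the query register holds amplitudes `amps[0..n]`, using $\langle\psi_x|\psi_y\rangle = \sum_i |a_i|^2 (-1)^{x_i + y_i}$ with $x_0 = y_0 = 0$. "Querying bit 1" is modeled, as an assumption, by putting equal amplitude on query values 0 and 1.

```python
import itertools
import math

def gram_after_one_query(n, amps):
    """Return M(1) as a dict; amps[i] is the amplitude on query value i (0..n)."""
    inputs = list(itertools.product((0, 1), repeat=n))
    M = {}
    for x in inputs:
        for y in inputs:
            xs, ys = (0,) + x, (0,) + y      # prepend the fixed bit x_0 = 0
            inner = sum(abs(amps[i]) ** 2 * (-1) ** (xs[i] + ys[i])
                        for i in range(n + 1))
            M[(x, y)] = abs(inner)
    return M

s = 1 / math.sqrt(2)
M = gram_after_one_query(3, [s, s, 0.0, 0.0])   # all amplitude on values 0 and 1
print(M[((0, 0, 0), (1, 0, 0))])   # pair differing on bit 1
print(M[((0, 0, 0), (0, 1, 1))])   # pair agreeing on bit 1
```

Pairs that differ on bit 1 are driven to zero, while pairs that agree on bit 1 are left at 1; spreading the amplitude over more query values lowers more entries, but each by less.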

In order to incorporate such tradeoffs while providing a necessary condition for $\epsilon$-computation
less complicated than the full set of conditions implied by Proposition 1, we might consider averaging
all the off-diagonal-block matrix elements' moduli $|M_{xy}|$. Since they must all drop below
$\kappa := 2\sqrt{\epsilon(1-\epsilon)}$, so must their average. In fact, we may consider any desired positive weighted sum $S$ of
the off-diagonal-block matrix elements, $\sum_{xy} \mu_{xy}|M_{xy}|$. For reasons that are still a bit mysterious
to us, it turns out that it is useful to express the weight $\mu_{xy} = \Gamma_{xy}\alpha_x\alpha_y$, where $\Gamma$ is a nonnegative
real matrix and $|\alpha\rangle$ is a unit vector with nonnegative entries. On the face of it, this more complex
expression provides no additional generality, but it provides additional flexibility in the analysis.
The vector $|\alpha\rangle$ can be interpreted, as Ambainis does, as an initial superposition of inputs in the
query register.

As an immediate consequence of Proposition 1 we have:
Proposition 2 Let $f$ be an $n$-variate boolean function, and let $A$ be a $t$-step quantum query algorithm
that attempts to compute $f$. Let $|\alpha\rangle$ be a unit vector indexed by $\{0, 1\}^n$ with nonnegative real
entries and $\Gamma$ be a matrix indexed by $\{0, 1\}^n \times \{0, 1\}^n$ with nonnegative real entries that satisfies
$\Gamma_{x,y} = 0$ if $f(x) = f(y)$. If $A$ $\epsilon$-computes $f$ then:

$$\sum_{xy} \Gamma_{xy}\alpha_x\alpha_y |M(t)_{xy}| \le \langle\alpha|\Gamma|\alpha\rangle \, 2\sqrt{\epsilon(1-\epsilon)}. \qquad (7)$$

For $l \in \{0, \ldots, t\}$, let us define

$$S_l = \sum_{xy} \Gamma_{xy}\alpha_x\alpha_y |M(l)_{xy}|. \qquad (8)$$

Since $M(0)$ is the all 1 matrix, Proposition 2 implies:

Proposition 3 For a $t$-step computation to $\epsilon$-compute $f$, it must be the case that

$$S_0 - S_t \ge \sum_{xy} \Gamma_{xy} |\alpha_x| \, |M(0)_{xy}| \, |\alpha_y| \left(1 - 2\sqrt{\epsilon(1-\epsilon)}\right) = \langle\alpha|\Gamma|\alpha\rangle \left(1 - 2\sqrt{\epsilon(1-\epsilon)}\right). \qquad (9)$$

We will now get a lower bound on $t$ by upper bounding $S_l - S_{l+1}$, the amount that the sum can
decrease as the result of a single query.

Proposition 4

$$S_l - S_{l+1} \le 2\sqrt{\nu}. \qquad (10)$$

The proof of this proposition appears in an Appendix. 

If we multiply this upper bound on decrease per query by $t$, this must exceed the difference
$S_0 - S_t$. This together with Proposition 3 completes the proof of the theorem.

4 Read-once Boolean functions 

A read-once Boolean function is one which can be written as a formula in propositional logic, 
involving each variable $x_i$ (each bit of the input string) only once. Each such function can be
represented by an AND/OR tree. This is a rooted labeled tree having n leaves, each corresponding 
to a different variable (with some possibly negated), and where each internal node is labeled either 
AND or OR. Each AND (resp., OR) node in the tree is associated to a function which is defined 
recursively as the AND (resp. OR) of the functions computed by its children. 

Without loss of generality, we may assume that all of the children of an AND node are OR 
nodes, and vice versa. Also, we restrict attention to monotone functions, which are those whose 
leaves are all nonnegated variables, since the query complexity of the function is preserved under 
negation of variables. 

In this section, we use the convention that the variable x indicates zeroes of the function, and 
the variables y and z indicate ones of the function. 

Theorem 2 $\Omega(\sqrt{n})$ is a lower bound on the bounded-error quantum query complexity of all
read-once Boolean functions.

We outline the proof technique before providing details.

We will apply Theorem 1. For this we need to define the matrix $\Gamma$ and the vector $|\alpha\rangle$. We
identify a subset of $\{0, 1\}^n$ called critical inputs. These are, intuitively, the inputs on which $f$ is
hardest to compute. (The same notion of critical input plays a similar role in the lower bound
proofs for the randomized query complexity of read-once functions [12].)

We also define what it means for two critical inputs $x \in f^{-1}(0)$ and $y \in f^{-1}(1)$ to be neighbors;
intuitively these are pairs of inputs that are hard to distinguish. We define the matrix $\Gamma$ to be the
characteristic function of the neighbor relation on the set of critical inputs. Given these choices, it
will turn out from the definition of critical inputs that $\nu$ is always one.

The main work of the proof comes in choosing the vector $|\alpha\rangle$. We look for a choice of the vector
$|\alpha\rangle$ that maximizes the expression in the lower bound of Theorem 1 (given our particular choice of
$\Gamma$). This (continuous) maximization problem is formulated using Lagrange multipliers, and gives
rise to a set of first-order conditions. We then construct $|\alpha\rangle$ that satisfies the first-order conditions.

This solution is constructed inductively. Assume that the root is an AND and has $r$ children
and for $i \in \{1, \ldots, r\}$ let $g_i$ denote the function computed at child $i$. (The case that the root is an
OR involves only obvious minor alterations.) We write $n_i$ for the number of (boolean) variables in
$g_i$. Thus $n := \sum_{i=1}^{r} n_i$ is the number of (boolean) variables in $f$. Assume that we have determined
$|\alpha^i\rangle$ for each of the $g_i$. We construct $|\alpha\rangle$ in terms of these. Further we show that if $|\alpha^i\rangle$ gives a
bound of $K\sqrt{n_i}$ for each of the $g_i$, then $|\alpha\rangle$ gives a bound of $K\sqrt{n}$ for $f$.

We proceed with the detailed proof. We have the read-once function $f$ represented by tree $T$,
and express $f$ as $g_1 \wedge \ldots \wedge g_r$ where $g_1, \ldots, g_r$ are the functions computed at the children of the root labeled
by AND.

In choosing a $\Gamma$ and $|\alpha\rangle$ for applying Theorem 1, we will focus our attention on critical inputs.
An input is critical if for each AND node, at most one child evaluates to 0, and for each OR node,
at most one child evaluates to 1. A critical input in $f^{-1}(1)$ is a critical one and a critical input in
$f^{-1}(0)$ is a critical zero.

We write $X$ (resp. $Y$) for the set of critical zeros (resp., critical ones) of $f$. For $i \in \{1, \ldots, r\}$
we write $X^i$ (resp., $Y^i$) for the set of critical zeros (resp., critical ones) of $g_i$. We use the letter $x$ to
denote an element of $X$, the letters $y$ and $z$ to denote elements of $Y$. Also $x^i$ denotes an element
of $X^i$ and $y^i$ and $z^i$ denote elements of $Y^i$.

Observe that since the root is an AND, a critical one $y$ may be written in the form $y = y^1 \ldots y^r$,
where for each $j$, $y^j$ is a critical one of $g_j$. For a critical zero $x$, exactly one of the children of the
root evaluates to 0. We say that $x$ is of type $i$, for $i \in \{1, \ldots, r\}$, if $g_i$ evaluates to 0. A critical
zero $x$ of type $i$ may be written in the form $x = z^1 \ldots z^{i-1} x^i z^{i+1} \ldots z^r$, where $x^i$ is a critical zero
of $g_i$ and for $j \ne i$, $z^j$ is a critical one of $g_j$.

Let $x \in X$ and $y \in Y$ and let $i$ be the type of $x$. We say that $y = y^1 \ldots y^r$ and $x =
z^1 \ldots z^{i-1} x^i z^{i+1} \ldots z^r$ are neighbors provided that $z^j = y^j$ for $j \ne i$ and $x^i$ and $y^i$ are neighbors
(defined recursively). We denote by $R$ the neighbor relation on $X \times Y$ and $R_i$ the neighbor relation
on $X^i \times Y^i$. It is easy to see that two neighbors differ on exactly one input variable and consequently,
for any critical input $w$ and any $j \in \{1, \ldots, n\}$, $w$ has at most one neighbor that differs from it on
variable $j$.

When we apply Theorem 1 we take $\Gamma$ to be the characteristic function of the relation $R$. It is
easily seen that for this $\Gamma$, the parameter $\nu$ appearing in the denominator in Theorem 1 is just 1:
by the last sentence of the previous paragraph, the quantity $\nu_{w,j}$ is at most 1 for any critical input
$w$ and $j \in \{1, \ldots, n\}$.

Having fixed $\Gamma$, we now want to choose $|\alpha\rangle$. Without loss of generality we will take the coordinates
of $|\alpha\rangle$ to be nonnegative real numbers and assume that they are zero outside of $X \cup Y$. We
look for an $|\alpha\rangle$ such that the lower bound expression in Theorem 1 is maximum. This means we
want to solve:



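$$\max \sum_{(x,y) \in R} \alpha_x \alpha_y \quad \text{s.t.} \quad \sum_{x} \alpha_x^2 + \sum_{y} \alpha_y^2 = 1 \qquad (11)$$

From Lagrange multiplier optimization for the above problem, we get the first-order conditions
(FOCs):

$$\alpha_x = C \sum_{y : (x,y) \in R} \alpha_y, \qquad \alpha_y = C \sum_{x : (x,y) \in R} \alpha_x. \qquad (12)$$

Here $C$ is a constant to be determined.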

Suppose we find $C$ and a unit vector $|\alpha\rangle$ satisfying (12). If we multiply the first FOC by $\alpha_x$ and
sum on $x \in X$ we get that the objective function is equal to $\frac{1}{C} \sum_{x \in X} \alpha_x^2$. Similarly if we multiply
the second FOC by $\alpha_y$ and sum on $y \in Y$ we have that the objective function equals $\frac{1}{C} \sum_{y \in Y} \alpha_y^2$.
This implies that $\sum_{x \in X} \alpha_x^2 = \sum_{y \in Y} \alpha_y^2 = \frac{1}{2}$ and the value of the objective function is $\frac{1}{2C}$. We will
prove:

Lemma 1 There is a nonnegative real unit vector $|\alpha\rangle$ satisfying (12) with $C = 1/\sqrt{n}$.

Theorem 2 now follows immediately from the lemma and Theorem 1.

The proof of the lemma is by induction. For the base case, we take $f$ to be the univariate
function $f(x_1) = x_1$. For this function $X = \{0\}$, $Y = \{1\}$ and $\alpha_0 = \alpha_1 = 1/\sqrt{2}$ solves the FOC with
$C = 1$. For the induction step, we assume that the lemma holds for each of the functions $g_i$ and
prove that it holds for $f$.

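Let $x = y^1 \ldots x^i \ldots y^r$ be an element of $X$. All neighbors $y$ of $x$ must have a critical
one $y^i$ in the $i$-th place that is a neighbor of $x^i$, while agreeing with $x$ in the other places, and thus
be of the form:

$$y^1 \ldots y^i \ldots y^r. \qquad (13)$$

So,

$$\alpha_x = \alpha_{y^1 \ldots x^i \ldots y^r} = C \sum_{y^i : (x^i, y^i) \in R_i} \alpha_{y^1 \ldots y^i \ldots y^r}. \qquad (14)$$

Similarly, for $y = y^1 \ldots y^r \in Y$,

$$\alpha_y = \alpha_{y^1 \ldots y^r} = C \sum_{i} \sum_{x^i : (x^i, y^i) \in R_i} \alpha_{y^1 \ldots x^i \ldots y^r}. \qquad (15)$$

By the induction hypothesis, for each $i \in \{1, \ldots, r\}$, we have a unit vector $\alpha^i$ that satisfies the
first-order conditions for $g_i$:

$$\alpha^i_{x^i} = C_i \sum_{y^i : R_i(x^i, y^i)} \alpha^i_{y^i}, \qquad (16)$$
$$\alpha^i_{y^i} = C_i \sum_{x^i : R_i(x^i, y^i)} \alpha^i_{x^i}, \qquad (17)$$

with $C_i = 1/\sqrt{n_i}$.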

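We proceed to establish the induction step. We guess that the weights at the top level are the
product of the weights at the next level down, up to a constant which can depend on the type of
the input whose weight we are computing. (Here, only the $x$'s have distinct types, depending on
which of the $g_i$ has the value zero.)

$$\alpha_y = \alpha_{y^1 \ldots y^r} = A \, \alpha^1_{y^1} \alpha^2_{y^2} \cdots \alpha^r_{y^r} \qquad (18)$$
$$\alpha_x = \alpha_{y^1 \ldots x^i \ldots y^r} = B_i \, \alpha^1_{y^1} \alpha^2_{y^2} \cdots \alpha^i_{x^i} \cdots \alpha^r_{y^r} \qquad (19)$$

We check these guesses by plugging them into both sides of (14) and (15), respectively, obtaining:

$$(\alpha_x =) \; B_i \, \alpha^1_{y^1} \cdots \alpha^i_{x^i} \cdots \alpha^r_{y^r} = C A \, \alpha^1_{y^1} \cdots \alpha^{i-1}_{y^{i-1}} \left( \sum_{y^i : (x^i, y^i) \in R_i} \alpha^i_{y^i} \right) \alpha^{i+1}_{y^{i+1}} \cdots \alpha^r_{y^r}, \qquad (20)$$

$$(\alpha_y =) \; A \, \alpha^1_{y^1} \cdots \alpha^r_{y^r} = C \sum_{i} B_i \, \alpha^1_{y^1} \cdots \alpha^{i-1}_{y^{i-1}} \left( \sum_{x^i : R_i(x^i, y^i)} \alpha^i_{x^i} \right) \alpha^{i+1}_{y^{i+1}} \cdots \alpha^r_{y^r}. \qquad (21)$$

The parenthesized sums in these expressions evaluate to $\alpha^i_{x^i}/C_i$ and $\alpha^i_{y^i}/C_i$, respectively, from
the lower level Lagrange FOCs (16)-(17). So our guess solves the higher level FOCs, provided the
constants are chosen consistently. In the equation (20) for $\alpha_x$, this requires

$$B_i = \frac{C A}{C_i}. \qquad (22)$$

In the equation (21) for $\alpha_y$, we obtain:

$$\alpha_y = C \sum_{i} \frac{B_i}{C_i} \, \alpha^1_{y^1} \cdots \alpha^i_{y^i} \cdots \alpha^r_{y^r}, \qquad (23)$$

and (since the weight product on the RHS is $i$-independent), this requires (substituting for $B_i$ using
(22))

$$A = C \sum_{i} \frac{B_i}{C_i} = C^2 A \sum_{i} \frac{1}{C_i^2}, \quad \text{i.e.} \quad \frac{1}{C^2} = \sum_{i} \frac{1}{C_i^2}. \qquad (24)$$

Since $C_i = 1/\sqrt{n_i}$ we deduce $C = 1/\sqrt{\sum_i n_i} = 1/\sqrt{n}$ as required.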
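The inductive construction can be checked numerically on small trees. The sketch below is an illustration under our own conventions, not code from the paper: a tree is the string `'x'` for a leaf or a pair `(op, [subtrees])` with `op` in `{'and', 'or'}`; the names `solve` and `check_focs` are hypothetical; the constant $A$ is set to 1 and the vector is renormalized at the end, which is harmless because the FOCs (12) are homogeneous. For an OR root the roles of zeros and ones are swapped, which is our reading of the "obvious minor alterations".

```python
import itertools
import math

def solve(tree):
    """Return (n, zeros, ones, nbrs): weights on critical zeros and ones,
    and the set of (critical zero, critical one) neighbor pairs."""
    if tree == 'x':
        w = 1 / math.sqrt(2)
        return 1, {(0,): w}, {(1,): w}, {((0,), (1,))}
    op, kids = tree
    parts = [solve(k) for k in kids]
    n = sum(p[0] for p in parts)
    same = 2 if op == 'and' else 1       # tuple index of the type all children share
    diff = 3 - same
    agree, typed, nbrs = {}, {}, set()
    for combo in itertools.product(*(p[same].items() for p in parts)):
        keys = [k for k, _ in combo]
        wts = [w for _, w in combo]
        whole = sum(keys, ())
        agree[whole] = math.prod(wts)                # guess (18) with A = 1
        for i, p in enumerate(parts):
            b_i = math.sqrt(p[0] / n)                # (22): B_i = C A / C_i
            for zi, yi in p[3]:
                s_in, d_in = (yi, zi) if op == 'and' else (zi, yi)
                if s_in != keys[i]:
                    continue
                other = sum(keys[:i] + [d_in] + keys[i + 1:], ())
                rest = math.prod(wts[:i] + wts[i + 1:])
                typed[other] = b_i * p[diff][d_in] * rest   # guess (19)
                nbrs.add((other, whole) if op == 'and' else (whole, other))
    zeros, ones = (typed, agree) if op == 'and' else (agree, typed)
    scale = math.sqrt(sum(w * w for w in list(zeros.values()) + list(ones.values())))
    zeros = {k: w / scale for k, w in zeros.items()}
    ones = {k: w / scale for k, w in ones.items()}
    return n, zeros, ones, nbrs

def check_focs(tree):
    """Assert the FOCs (12) with C = 1/sqrt(n); return the objective of (11)."""
    n, zeros, ones, nbrs = solve(tree)
    C = 1 / math.sqrt(n)
    for x, w in zeros.items():
        assert math.isclose(w, C * sum(ones[y] for (u, y) in nbrs if u == x))
    for y, w in ones.items():
        assert math.isclose(w, C * sum(zeros[x] for (x, v) in nbrs if v == y))
    return sum(zeros[x] * ones[y] for x, y in nbrs)

tree = ('or', [('and', ['x', 'x']), 'x'])   # f = (x1 AND x2) OR x3
print(check_focs(tree), math.sqrt(3) / 2)
```

For this three-variable tree the objective comes out as $\sqrt{3}/2 = 1/(2C)$, in agreement with the value derived from the first-order conditions.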
A Proof of Proposition 4

Without loss of generality for our purposes, we may take $\Gamma$ to be symmetric, upper triangular, or
lower triangular. We proceed to establish an upper bound on the magnitude of the decrease of the
weighted sum (8) in a single query. It will be convenient to assume $\Gamma$ is symmetric. Define $|\psi_x^i\rangle$ as
the component of $|\psi_x\rangle$ having $i$ in the query register, so that

$$|\psi_x\rangle = \sum_{i} |\psi_x^i\rangle. \qquad (25)$$

Then

$$M = \sum_{i} M^i \qquad (26)$$

where $M^i$ is defined via $(M^i)_{xy} = \langle\psi_x^i|\psi_y^i\rangle$ and, in this appendix, $M_{xy}$ denotes the inner product
$\langle\psi_x|\psi_y\rangle$ itself rather than its modulus. (Just write $M_{xy}$ as an inner product, use (25), and
note that the $i \ne j$ terms are zero.) To reduce clutter, define $\rho^i_{xy} := \alpha_x^* M^i_{xy} \alpha_y$. In this notation
we want to upper bound $\sum_i \sum_{xy : x_i \ne y_i} \Gamma_{xy} |\rho^i_{xy}|$. Consider the inner sum first. We first upper bound
$|\rho^i_{xy}|$ as a linear combination of $\rho^i_{xx}$ and $\rho^i_{yy}$. For this purpose we introduce a nonnegative matrix
$\beta^i = (\beta^i_{xy})$, whose entries we will specify later. We have:



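$$|\rho^i_{xy}| \le \sqrt{\rho^i_{xx} \rho^i_{yy}} = \sqrt{\beta^i_{xy} \rho^i_{xx} \cdot \frac{1}{\beta^i_{xy}} \rho^i_{yy}} \le \frac{1}{2} \left( \beta^i_{xy} \rho^i_{xx} + \frac{1}{\beta^i_{xy}} \rho^i_{yy} \right). \qquad (27)$$

(The first inequality is due to the positivity of $\rho^i$, which requires that the determinant of any
principal minor be nonnegative; the second is the arithmetic-geometric mean inequality.) Then

$$\sum_{xy : x_i \ne y_i} \Gamma_{xy} |\rho^i_{xy}| \le \sum_{xy : x_i \ne y_i} \Gamma_{xy} \frac{1}{2} \left( \beta^i_{xy} \rho^i_{xx} + \frac{1}{\beta^i_{xy}} \rho^i_{yy} \right) = \sum_{x} \sum_{y : y_i \ne x_i} \Gamma_{xy} \beta^i_{xy} \rho^i_{xx}. \qquad (28)$$

The last equality is just due to the symmetry under $x \leftrightarrow y$, using the symmetry of $\Gamma$ and
$\beta^i_{yx} = 1/\beta^i_{xy}$, which holds for the choice of $\beta^i$ made next.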
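We now define

$$\beta^i_{xy} = \sqrt{\frac{\nu_{y,i}}{\nu_{x,i}}}, \qquad (29)$$

where $\nu_{x,i}$ was defined at the beginning of Section 3.
The last expression in (28) becomes:

$$\sum_{x} \rho^i_{xx} \sum_{y : y_i \ne x_i} \Gamma_{xy} \sqrt{\frac{\nu_{y,i}}{\nu_{x,i}}}. \qquad (30)$$

Recalling from the beginning of Section 3 that $\nu_i^b = \max_{x : f(x) = b} \nu_{x,i}$ and $\nu = \max_i \nu_i^0 \nu_i^1$, we can
bound this expression by:

$$\sum_{x} \rho^i_{xx} \sqrt{\frac{\nu_i^{1 - f(x)}}{\nu_{x,i}}} \sum_{y : y_i \ne x_i} \Gamma_{xy} = \sum_{x} \rho^i_{xx} \sqrt{\nu_{x,i} \, \nu_i^{1 - f(x)}} \le \sqrt{\nu} \sum_{x} \rho^i_{xx}.$$

Summing on $i$, and using $\sum_i \sum_x \rho^i_{xx} = \sum_x |\alpha_x|^2 = 1$, then yields:

$$\sum_{i} \sum_{xy : x_i \ne y_i} \Gamma_{xy} |\rho^i_{xy}| \le \sqrt{\nu}. \qquad (31)$$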



This equation is an important lemma, which we use to establish our bound on the decrease of 
the weighted sum in a single query. For quantities which change in a query, we distinguish the 
post-query quantity by priming it. 



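$$S - S' := \sum_{xy} \Gamma_{xy} \left( |\rho_{xy}| - |\rho'_{xy}| \right) \le \sum_{xy} \Gamma_{xy} |\rho_{xy} - \rho'_{xy}| \le \sum_{i} \sum_{xy} \Gamma_{xy} |\rho^i_{xy} - \rho'^i_{xy}|, \qquad (32)$$

where $\rho_{xy} = \sum_i \rho^i_{xy}$. We can bound the last sum by noting that since $\rho^i$ is the input register density
matrix relative to $i$ in the query index, the query multiplies density matrix elements $\rho^i_{xy}$ by a factor
$(-1)^{x_i + y_i}$, leaving them unchanged if $x_i = y_i$. Thus we obtain

$$\sum_{i} \sum_{xy} \Gamma_{xy} |\rho^i_{xy} - \rho'^i_{xy}| = \sum_{i} \sum_{xy : x_i \ne y_i} \Gamma_{xy} |\rho^i_{xy}| + \sum_{i} \sum_{xy : x_i \ne y_i} \Gamma_{xy} |\rho'^i_{xy}|. \qquad (33)$$

We can then apply Eq. (31) to each of these terms, obtaining:

$$S - S' \le \sqrt{\nu} \left( \mathrm{tr}\,\rho + \mathrm{tr}\,\rho' \right) = 2\sqrt{\nu}. \qquad (34)$$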